Recognizing Extended Spatiotemporal Expressions by Actively Trained Average Perceptron Ensembles

نویسندگان

  • Wei Zhang
  • Yang Yu
  • Osho Gupta
  • Judith Gelernter
چکیده

Precise geocoding and time normalization for text requires that location and time phrases be identified. Many state-of-the-art geoparsers and temporal parsers suffer from low recall. Categories commonly missed by parsers are: nouns used in a nonspatiotemporal sense, adjectival and adverbial phrases, prepositional phrases, and numerical phrases. We collected and annotated data set by querying commercial web searches API with such spatiotemporal expressions as were missed by state-of-theart parsers. Due to the high cost of sentence annotation, active learning was used to label training data, and a new strategy was designed to better select training examples to reduce labeling cost. For the learning algorithm, we applied an average perceptron trained Featurized Hidden Markov Model (FHMM). Five FHMM instances were used to create an ensemble, with the output phrase selected by voting. Our ensemble model was tested on a range of sequential labeling tasks, and has shown competitive performance. Our contributions include (1) an new dataset annotated with named entities and expanded spatiotemporal expressions; (2) a comparison of inference algorithms for ensemble models showing the superior accuracy of Belief Propagation over Viterbi Decoding; (3) a new example re-weighting method for active ensemble learning that “memorizes” the latest examples trained; (4) a spatiotemporal parser that jointly recognizes expanded spatiotemporal expressions as well as named entities.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Variational Inference for Robust Sequential Learning of Multilayered Perceptron Neural Network

We derive a new sequential learning algorithm for Multilayered Perceptron (MLP) neural network robust to outliers. Presence of outliers in data results in failure of the model especially if data processing is performed on-line or in real time. Extended Kalman filter robust to outliers (EKF-OR) is probabilistic generative model in which measurement noise covariance is modeled as stochastic proce...

متن کامل

Weather analysis using ensemble of connectionist learning paradigms

This paper presents a comparative analysis of different connectionist and statistical models for forecasting the weather of Vancouver, Canada. For developing the models, one year’s data comprising of daily temperature and wind speed were used. A multi-layered perceptron network (MLPN) and an Elman recurrent neural network (ERNN) were trained using the one-step-secant and Levenberg–Marquardt alg...

متن کامل

Using neural network ensembles for bankruptcy prediction and credit scoring

Bankruptcy prediction and credit scoring have long been regarded as critical topics and have been studied extensively in the accounting and finance literature. Artificial intelligence and machine learning techniques have been used to solve these financial decision-making problems. The multilayer perceptron (MLP) network trained by the back-propagation learning algorithm is the mostly used techn...

متن کامل

Comparing Template-based, Feature-based and Supervised Classification of Facial Expressions from Static Images

We compare the performance and generalization capabilities of different low-dimensional representations for facial emotion classification from static face images showing happy, angry, sad, and neutral expressions. Three general strategies are compared: The first approach uses the average face for each class as a generic template and classifies the individual facial expressions according to the ...

متن کامل

Classifier ensembles for image identification using multi-objective Pareto features

In this paper we propose classifier ensembles that use multiple Pareto image features for invariant image identification. Different from traditional ensembles that focus on enhancing diversity by generating diverse base classifiers, the proposed method takes advantage of the diversity inherent in the Pareto features extracted using a multi-objective evolutionary Trace Transform algorithm. Two v...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1508.04525  شماره 

صفحات  -

تاریخ انتشار 2015